fix: auto-refresh OIDC token with proactive + reactive strategy #17
Conversation
- Proactive refresh: refresh when <15min remains on the token TTL (~75% of a 1h token elapsed), preventing latency spikes on heartbeat/event paths
- Reactive refresh: retry once with forced refresh on 401 (handles clock skew, server-side revocation)
- Fix body consumption bug: use req.GetBody to obtain a fresh body for the retry request, since req.Clone shares the original io.ReadCloser
- Persist refreshed tokens to keyring so other processes see them

Closes #9

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```go
	}
	r2 := req.Clone(req.Context())
	r2.Header.Set("Authorization", "Bearer "+token)
	return t.base.RoundTrip(r2)
```
🔴 401 retry sends request with consumed (empty) body, breaking retry for all POST requests
When bearerTransport.RoundTrip retries on a 401, it clones the original req to create r2. However, req.Body was already consumed by the first t.base.RoundTrip(r) call — http.Request.Clone performs a shallow copy of Body, so both r and req share the same underlying io.ReadCloser. After the first round trip reads and exhausts the body, req.Body is at EOF. The second clone r2 therefore gets the same consumed reader, causing the retry to send a POST with an empty body.
Since all ConnectRPC unary RPCs use POST with a serialized protobuf payload, every 401 retry will send an empty body to the server, which will fail with a deserialization or protocol error rather than succeeding with the refreshed token. The entire retry-on-401 feature is effectively non-functional.
Fix: use GetBody to obtain a fresh body reader for the retry
The fix is to call req.GetBody() (which http.NewRequest sets for *bytes.Buffer, *bytes.Reader, and *strings.Reader bodies) to obtain a fresh reader for the retried request.
Prompt for agents
In bearerTransport.RoundTrip (internal/backend/backend.go), the 401 retry path clones the original request with req.Clone(), but at that point req.Body has already been consumed by the first t.base.RoundTrip(r) call. Clone only does a shallow copy of Body, so r2.Body is the same exhausted reader.
To fix this, after cloning req into r2, reset r2.Body using req.GetBody() (which http.NewRequest sets for common body types like *bytes.Buffer, *bytes.Reader, *strings.Reader). If GetBody is nil (unlikely for ConnectRPC but defensive), return the original 401 response instead of retrying with an empty body.
The fix should look something like:
```go
r2 := req.Clone(req.Context())
if req.GetBody != nil {
	var bodyErr error
	r2.Body, bodyErr = req.GetBody()
	if bodyErr != nil {
		return nil, fmt.Errorf("reset request body for 401 retry: %w", bodyErr)
	}
}
r2.Header.Set("Authorization", "Bearer "+token)
return t.base.RoundTrip(r2)
```
Also note: the first clone r := req.Clone(req.Context()) also shares the body, so after t.base.RoundTrip(r) consumes it, req.Body is also consumed. This is fine for the non-retry path but is the root cause of the retry issue.
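The shared-body behavior described above can be demonstrated in isolation. This is a standalone sketch (the demo function and URL are invented for illustration, not code from the PR) showing that a clone made after the body is consumed reads nothing, while GetBody yields the full payload again:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// demo builds a POST request, simulates the first round trip consuming the
// body, then shows what a clone and GetBody each see afterwards.
func demo() (first, clone, fresh string) {
	// http.NewRequest sets GetBody automatically for *bytes.Reader,
	// *bytes.Buffer, and *strings.Reader bodies.
	req, err := http.NewRequest(http.MethodPost, "http://example.invalid/rpc",
		bytes.NewReader([]byte("payload")))
	if err != nil {
		panic(err)
	}

	b1, _ := io.ReadAll(req.Body) // stands in for the first t.base.RoundTrip(r)

	r2 := req.Clone(req.Context()) // shallow-copies Body: same exhausted reader
	b2, _ := io.ReadAll(r2.Body)

	rc, err := req.GetBody() // fresh reader over the original payload
	if err != nil {
		panic(err)
	}
	b3, _ := io.ReadAll(rc)
	return string(b1), string(b2), string(b3)
}

func main() {
	first, clone, fresh := demo()
	fmt.Printf("first=%q clone=%q getbody=%q\n", first, clone, fresh)
}
```

Running this prints first="payload" clone="" getbody="payload": the clone sees an empty body, GetBody does not.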
Was this helpful? React with 👍 or 👎 to provide feedback.
Summary
Fixes #9 — OIDC access tokens expire during long coding sessions, causing heartbeats and event ingestion to silently fail.
What changed
Proactive refresh — The token source checks remaining TTL on every call. When less than 15 minutes remain (~75% of a typical 1h token), it refreshes ahead of time. This prevents latency spikes and avoids concurrent-refresh races on the heartbeat/event paths.
Reactive refresh (401 retry) — bearerTransport.RoundTrip retries once with a forced token refresh on 401. Handles clock skew, server-side revocation, and edge cases where proactive refresh didn't fire in time.

Body consumption fix — The retry path now uses req.GetBody() to obtain a fresh request body. Previously, req.Clone() shared the original io.ReadCloser, so the retry would send an empty body (all ConnectRPC unary RPCs use POST).

Token persistence — Refreshed tokens are saved to the system keyring so other processes and subsequent kontext start runs see the new token.

Changes

- internal/backend/backend.go: TokenSource takes forceRefresh bool, RoundTrip retries on 401 with fresh body
- internal/run/run.go: newSessionTokenSource honors forceRefresh, shouldProactiveRefresh triggers at <15min remaining

Test plan

- Run kontext start --agent claude for >1h, verify heartbeats continue after token expiry
- Verify ✓ Token refreshed appears in stderr when proactive refresh fires

🤖 Generated with Claude Code
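The reactive half of the strategy can be exercised end to end against a local test server. This is a minimal sketch, not the PR's implementation: bearerTransport and the tokenSource function type here are illustrative stand-ins, with forceRefresh mirroring the parameter named in the change list. The server rejects the first attempt with 401; the transport then retries once with a forced-refresh token and a fresh body from GetBody:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// tokenSource is an illustrative stand-in for the session token source;
// passing true forces a refresh.
type tokenSource func(forceRefresh bool) string

// bearerTransport sketches the retry-on-401 strategy described above.
type bearerTransport struct {
	base   http.RoundTripper
	tokens tokenSource
}

func (t *bearerTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	r := req.Clone(req.Context())
	r.Header.Set("Authorization", "Bearer "+t.tokens(false))
	resp, err := t.base.RoundTrip(r)
	if err != nil || resp.StatusCode != http.StatusUnauthorized {
		return resp, err
	}
	if req.GetBody == nil {
		return resp, nil // cannot safely replay the body; surface the 401
	}
	resp.Body.Close()
	r2 := req.Clone(req.Context())
	body, err := req.GetBody() // fresh reader: the first attempt consumed req.Body
	if err != nil {
		return nil, fmt.Errorf("reset request body for 401 retry: %w", err)
	}
	r2.Body = body
	r2.Header.Set("Authorization", "Bearer "+t.tokens(true)) // forced refresh
	return t.base.RoundTrip(r2)
}

// runDemo stands up a server that 401s the first attempt, then echoes what
// the retry actually delivered.
func runDemo() string {
	var calls atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		payload, _ := io.ReadAll(r.Body)
		if calls.Add(1) == 1 {
			w.WriteHeader(http.StatusUnauthorized) // stale token on first attempt
			return
		}
		fmt.Fprintf(w, "token=%s body=%s", r.Header.Get("Authorization"), payload)
	}))
	defer srv.Close()

	tokens := func(force bool) string {
		if force {
			return "fresh"
		}
		return "stale"
	}
	client := &http.Client{Transport: &bearerTransport{base: http.DefaultTransport, tokens: tokens}}
	resp, err := client.Post(srv.URL, "application/proto", bytes.NewReader([]byte("payload")))
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	return string(out)
}

func main() {
	fmt.Println(runDemo())
}
```

The retry reaches the server with both the refreshed token and the intact payload; without the GetBody reset, the second request's body would arrive empty, which is the bug the review comment describes.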